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Abstract. Most statistical software packages implement numerical strategies 
for computation of maximum likelihood estimates in random effects models. 
Little is known, however, about the algebraic complexity of this problem. For 
the one-way layout with random effects and unbalanced group sizes, we give 
formulas for the algebraic degree of the likelihood equations as well as the equa- 
tions for restricted maximum likelihood estimation. In particular, the latter 
approach is shown to be algebraically less complex. The formulas are obtained 
by studying a univariate rational equation whose solutions correspond to the 
solutions of the likelihood equations. Applying techniques from computational 
algebra, we also show that balanced two-way layouts with or without interac- 
tion have likelihood equations of degree four. Our work suggests that algebraic 
methods allow one to reliably find global optima of likelihood functions of lin- 
ear mixed models with a small number of variance components. 



1. Introduction 

Linear models with fixed and random effects are widely used for dependent ob- 
servations. Such mixed models are typically fit using likelihood-based techniques, 
and the necessary optimization problems can be solved using the numerical meth- 
ods implemented in various statistical software packages, as discussed, for instance, 
in |Far06j . Such software typically takes into account that the variance parameters 
are nonnegative. However, general-purpose optimization procedures do not give any 
guarantees that a global optimum is found; compare Section 1.8 in [Jia07]. It can 
thus be appealing to compute maximum likelihood (ML) estimates algebraically. 
Since linear mixed models have rational likelihood equations, this involves careful 
clearing of denominators and applying symbolic and specialized numerical tech- 
niques to determine all solutions of the resulting polynomial system. An expla- 
nation of what we mean by careful clearing of denominators is given in [DSS09, 
Chap. 2]. While solving likelihood equations algebraically may not be feasible in 
large models with several random factors, modern computational algebra does allow 
one to fully understand the likelihood surface in practically relevant settings. 

The main contribution of this paper is a study of the algebraic complexity of 
ML estimation in the unbalanced one-way layout with random effects. This model 
concerns a collection of grouped observations 

(1) = fi + oti + £ij, i = l,...,q, j = l,...,rii. 

The overall mean /i 6 K is a fixed ('non-random') but unknown parameter. The 
random effects ai and the error terms e%j are mutually independent normal random 
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variables. More precisely, ai ~ A/"(0, r) and Sij ~ A/"(0, ui), where r and w denote the 
common variances of the random effects and the error terms, respectively. Clearly, 
the distribution of observation is A/"(/i, r + w), and two observations Y^ and 
lit from the ith group are dependent with covariance r. A detailed discussion 
and examples of applications of this specific model can be found, for instance, in 
Chapter 3 of [SCM92] and in Chapter 11 of [SO05] . 

The covariance matrix of the joint multivariate normal distribution for all Yy 
defined by ([TJ) is the product of the scalar w and a matrix that is a function of the 
variance ratio 9 = t/uj. Therefore, when 9 is known, the likelihood equations for 
fj, and u are of the type encountered in generalized least squares calculations, with 
a unique solution that is a rational function of the data and the known value of 9. 
We may thus eliminate /i and lo from the likelihood equations, which then reduce to 
a single univariate equation. Before turning to a first example, we remark that we 
always tacitly assume suitable sample size conditions to be satisfied such that ML 
estimates exist. In particular, we assume there to be q > 2 groups with at least one 
group of size ni > 2. A definitive answer to the existence problem in linear mixed 
models is given in [DM99] who also treat restricted maximum likelihood (REML) 
estimation; see |MN89j for an introduction to this technique. 

Example 1. Textbook data from [DG72, §6.4] give the yield of dyestuff from 
5 different preparations from each of q = 6 different batches of an intermediate 
product; the data are also available in the R package lme4. The layout is balanced, 
that is, all batch sizes are equal, here ni = 5. In this case, the likelihood equations 
are well-known to be equivalent to a linear equation system, and the ML estimators 
are rational functions of the observations Yij. In terminology we will use later on, 
balanced one-way layouts have ML degree one. Exactly the same is true for REML. 

A different picture emerges in the unbalanced case, when the batch sizes are not 
all equal. For illustration, we remove the first, second and sixth observation from 
the data. The first batch then only comprises n\ = 3 preparations, and the second 
batch only n-i = 4. The remaining batches are unchanged with ni = 5 for i > 3. 
In this unbalanced case, the solutions of the likelihood equations correspond to the 
solutions of the polynomial equation 

(2) - 245488320000 9 7 - 277109078400 9 6 - 58814614680 9 5 + 54052612853 9 4 
+ 37792395524 9 3 + 10086075110 9 2 + 1279832076 9 + 64175517 = 0. 

As the large integers suggest, this equation is exact given the input. However, the 
measurements that enter the computation are, of course, rounded. 

Numerical optimization using the R package lme4 yields a local maximum of the 
likelihood function that corresponds to 9 « 0.5585. We may check whether this 
local maximum is unique, or at least a global optimum, by finding all roots of the 
above univariate polynomial. This is a task that can be done reliably in computer 
algebra systems. However, such a computation is not needed here. The polynomial 
in ^ has exactly one sign change in its coefficient sequence. Hence, by Descartes' 
rule of signs, it has precisely one positive real root. The mere construction of the 
polynomial thus reveals that the local maximum we computed is the unique local 
(and global) maximum of the likelihood function; recall that the parameter 9 is 
restricted to be nonnegative. 
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A similar story unfolds for REML estimation. The only difference is that the 
degree of the relevant polynomial drops to five: 

(3) - 17047800000 9 5 - 6811774200 9 4 + 5505084700 9 3 

+ 4048254212 9 2 + 897954164 9 + 67458244 = 0. 

We note that equations {2J and |3]) cannot be solved by radicals. The Galois groups 
are the symmetric groups SV and S5, respectively. 

It is natural at this point to ask for the maximum likelihood degree of the one-way 
layout as a function of the number of groups q and the group sizes rii, . . . , n q . The 
ML degree is the number of complex solutions to the likelihood equations when 
the data are generic. Indeed, the number of complex solutions is constant with 
probability one, and a data set is generic if it is not part of the null set for which 
the number of complex solutions is different. The REML degree is defined in just 
the same way, but starting from different equations. Either degree measures the 
algebraic complexity of the computation of the estimates. In Example[T] it is simply 
the degree of a univariate polynomial in 9 = t/lj whose roots yield the possibly 
complex vectors (h,uj,t) that solve the likelihood equations. For more background 
on ML degrees, see |HKS05I ICHKS06I IBHR07I IDSS09I IStuOQI IHSIO] . 

Our main result answers the above question. TheoremQ] which we prove in later 
sections, gives formulas for both the ML and the REML degree of possibly unbal- 
anced one-way layouts and offers a direct comparison of the algebraic complexity 
of the two approaches. The theorem is conveniently stated using a notion of mul- 
tiplicities. Suppose v = (vi, . . . ,v q ) G IP is a tuple of integers. If v has M distinct 
entries, then the multiplicities of v form the integer multiset {mi, . . . , ttim}, where 
7ji j counts how often the jth distinct entry of v appears among all entries of v. 

Theorem 1. Consider a one-way layout with random effects for q groups that 
are of sizes n\, . . . ,n q . Suppose M of the group sizes are distinct, with associated 
multiplicities m\,...,rriM- Let M 2 = : rrij > 2}. Then the ML degree is 

3M + M 2 - 3, and the REML degree is 2M + 2M 2 - 3. The ML degree exceeds the 
REML degree unless M 2 = M, in which case equality holds. 

The condition M 2 = M holds if each group size appears at least twice. In the 
balanced case, we have M = M 2 = 1 and the theorem recovers the well-known 
fact that both degrees are one; compare |Hoc851 SCM92, [SO04] . Each degree is 
maximal when the group sizes n\, . . . ,n q are pairwise distinct. The degrees are 
then 3q - 3 for ML and 2q - 3 for REML. 

Example 2. The model for the dyestuff data from Example [T] has q — 6 groups. 
The unbalanced case we considered had group sizes (ni, . . . , n@) = (3, 4, 5, 5, 5, 5). 
The multiplicities are {1,1,4}. Our formulas confirm the ML and REML degree to 
be 3-3 + 1 — 3 = 7 and 2-3 + 2-1 — 3 = 5, respectively. As another example, if 
(ni, . . . , n 6 ) = (4, 4, 3, 2, 2, 2), then the ML degree is 8 and the REML degree is 7. 

The remainder of the paper is structured as follows. In Section [21 we review 
the derivation of the likelihood equations for ML and REML estimation. Section [3] 
contains the proof of the ML degree formula from Theorem [TJ and Section [4] treats 
the REML degree. Each proof consists of a detailed study of a univariate rational 
equation in the variance ratio 9. In Section [5l we demonstrate that algebraic 
computations are feasible for more general linear mixed models. More precisely, 
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we treat a one-way layout with q = 109 unbalanced groups and a mean structure 
given by two covariates that is relevant in a recent application. In Section [6j we 
consider balanced two-way layouts. These are known to have REML degree equal 
to one, and we show that the ML degree is four, which means that ML estimates 
are available in closed form in the sense of Cardano's formula. Our conclusions are 
summarized in Section [Jj where we also give two examples of unbalanced one-way 
random effects models with bimodal likelihood functions. 

2. The likelihood equations 

Let ni, . .., n M be unique group sizes with associated multiplicities mi, . . . , tum- 
Let Yij = (Yiji, . . . ,Yij Ui ) be the vector comprising the observations in the jth 
group of size rij. Then the model for the one-way layout given by (UJ can equiv- 
alently be described as stating that Yn , . . . , Y\ mi , Y%i , . . . , YMm M are independent 
multivariate normal random vectors with 

Yij ~ Af (/il ni , S ni (oj, r)) , 

where the covariance matrix is 

E ni (w, t) = u:I ni + rl„ 4 1^. . 

Here, 1„ = (1, . . . , 1) T € R n , and /„ is the n x n identity matrix. 

2.1. Maximum likelihood. Ignoring additive constants and multiplying by two, 
the log-likelihood function of the one-way model is 

M m, 

(4) i(p,u,r) = ^^logdet(X ni (w,r)) - - fil ni ) T K ni {u, r)(Y l3 - M l ni ), 
where 

(5) ff„ 4 (w,r) = -/ ni -- r 4 T 1 ^ 1 ™,- 

is the inverse of S ni (w,r). The inverse has determinant 

1 



(6) &et{K ni {u,T)) 

Let iV = rriiTii + • • • + itimtim be the total number of observations. For each 
i = 1, . . . , M, define the group averages 

1 n< 

Yij y Yijk, j i, . . . , 

Tli i! — ' 

1 fe=l 

and the average across the groups of equal size 

Yi / ^7 ■ 

From the averages, compute the between- group sum of squares 



i=i 
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Note that, for generic data, Bj = if and only if rrij = 1. Therefore, it suffices to 
consider the sums of squares Bi with to; > 2. Finally, define the within- group sum 
of squares 

M mi rii 

^ = £££(^-^) 2 , 

i=l j=l k=l 

which is positive for generic data. 

Proposition 1. Upon the substitution n = 1/co and 8 = t/u), the log-likelihood 
function for the one-way layout can be written as 



(7) i(fj,,K,0) = Nlog(K)-KW 



M 



^TO;log(l + ni6) 



1=1 



M 



i=l 



n, 



;B,, 



M 



E m i n i (v \2 



»=i 

Proof. Applying ([5]), the quadratic form in (jj) can be expanded into 
(Yij - /iln<) T 'Kni(u,T)(Yij - fJ,l n J 

= - £(^ fe - m) 2 - -t-4 — v - M Tl m] 

fc=i 



1 rti 



fe=l 



— — ■ rn?(Fij - /i) 2 

w(w + n;T) 



;(%-/i) 2 + «X;(l r «*-*<j) S 



fc=l 



1 + n,-6 

Using this expression and (|6|), the log- likelihood function is seen to be equal to 

(8) 1{h,k,6) 



N log(/t) - kW - 



A I 



£m, log(l + UiO) 



M m; 



_i=i j=i 

The claimed form of £(/i, k, #) is now obtained by expanding the last sum as 



(9) 



(10) 



1 + nS 



[{Yij ~ %f + & - A*) 2 + 2(F« - r 4 )(y< - M)] 



3=1 



l+rii 



;(.Y-vy + 



i + m 



-.Bi 



□ 
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The partial derivatives of the log-likelihood function from Proposition [T] are 

M 



0£ 



x - mini - 

4 — 1 



dt N 



M 



M 



-Bi 



(13) 



00 



A I 



,»=i 



- (1 + n, 

^ (1 + n^)2 ( ^ ~ M)2 + (1 + n^ 2 



M 9 

mint 



A I 



Since iV ^ 0, the equation system obtained by setting the three partials to zero has 
the same solution set as the equation system 



it/ 



(14) 
(15) 
(16) 



Y- 
^ i 



(% - m) = 0, 



N — K 



A I 



n,6 



T-B< 



- (1 + n 



St 



mini 



0. 



= 0. 



Now we can solve equation (fl4j) for /i, substitute the result into equation (fTSj) 
and solve for k. Both and k are then expressed in terms of 0. Substituting the 
expressions into (116[) . we obtain a univariate rational equation in 0. Our proof of the 
ML degree formula in Theorem Q] proceeds by cancelling terms from the numerator 
and denominator of this rational expression. This is the topic of Section |3l 

2.2. Restricted maximum likelihood. The REML method uses a slightly differ- 
ent likelihood function that is obtained by considering a projection of the observed 
random array (Yijk) £ K . The mean of this array has all entries equal to fj,. In 
other words, it is modelled to lie in the space £ C M. N spanned by the array with all 
entries equal to one. The likelihood function used in REML is obtained by taking 
the observation to be the projection of {Yijk) onto the orthogonal complement of 
C. The distribution of the projection no longer depends on /i and so the REML 
function only has (r, ui) or, equivalently, (k, 0) as arguments. 

Using the formulas given, for instance, in [MN89] . and simplifying the resulting 
expressions similar to what was done in the proof of Proposition [TJ we obtain the 
following expression for the restricted log-likelihood function. 

Proposition 2. Upon the substitution k = X/u> and = t/lo, the restricted log- 
likelihood function for the one-way layout can be written as 



(17) 6) = (N - 1) log(«) - kW 



A I 



A I 



m6 



AI 



»=i 



E m log(l + n t 0) 

' M 

Si 



i=i 
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with 
(18) 



E^E 



E 



«=1 1+Tl; 



i=l 2_/j=l l+n ; 6 



EM mini 
i=l l+n 4 6 



Note that /t(0) is the solution to the equation in (fT4|) . Computing jl(9) is the 
standard way to obtain an estimate of /x from a REML estimate of 9. 

The partial derivatives of the restricted log-likelihood function from Proposition^ 
are 



dl 

< 19 » i 



N - 1 



W 



M 



^ 1 

i=l 



M 



ti 1 



-Mi 



dl 

< 20 > § - - 



M 

\ TTiiTli 



EM min i 
i=l (l+m6) 



' M 

f-f (1 + rii6>r f—f (1 

2=1 X / 2=1 V 



mill 



1 (l+r 
2 



M 



71^) 2 



;B, 



The equation 81/ On = is easily solved. Substituting the unique solution into 
the equation dl/89 = yields again a univariate rational equation in 9. The proof 
of the REML degree formula in Theorem [1] requires studying cancellations from the 
numerator and denominator of this equation, which is the topic of Section 0] 



3. Proof of formula for ML degree 

Our proof of the ML degree formula in Theorem [T] proceeds in two steps. First, 
in Lemma Q] we derive a univariate rational equation whose number of zeros is the 
ML degree of the model. Second, we simplify it in Lemmas [2] and [3] by clearing 
common factors from the numerator and the denominator. 

Fix the following notation, used throughout. For a vector a = (ai, . . . , am) £ 
K M , define the rational functions 

r a(6') = > j — -a, and s a (9) 



f-f 1 + M 1 aw f-> (1 + n^) 2 *' 

2=1 2=1 

We write r\ , r B / m , ry , ry2 for the functions r a that have 

a=l M , o= (—,...,— — ) , a = (Yi, . . . , Y M ), a = (Y?, . . . , Y^), 
\mi m M J 

respectively. It is clear from Section [2] that forming a common denominator for the 
rational equations to be studied involves the product 

M 

d(6)=l[(l + n i 6) = d 1 (0)d 2 (9), 
»=i 

where 

di{o)= n ( i+n * e )> d ^ e )= n ( i + n * )- 

{i:TOi=l} {i:m;>2} 
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For a vector a £ M. M , define the degree M — 1 polynomial 

M 

f a (9) = d(9)r a (9) = ^m l n l a l [](l + n^) 

i=l j^i 

and the degree 2(M — 1) polynomial 

«7 O (0) - d(#) 2 s a (#) = ^ m.nfa, + ?i^) 2 . 

i=l j/i 

Lemma 1. The ML degree of the one-way layout is the degree of the numerator 
created when cancelling all common factors from numerator and denominator of 
the following rational function in 9: 

(21) NdWKW 

x (N [h{9) 2 g Y ,{8) - 2f Y (9)f 1 (9)g Y (9) + f Y (9) 2 9l (9) + h{9) 2 g B/m {9)] 

-fi(0) 2 [Wf 1 (9)d(9)+f Y ^0)fi(0)~f Y (0) 2 + fi(9)f B/m (9)]) . 

Proof. Adopting the notation above, the solution of the first of the likelihood equa- 
tions in (|T4")) can be written as 

(22) W) = 

n(0) 

Next, rewrite the following term from the system of the three critical equations: 
(23) 

M ,n\ M //ix9 M 



1 + n^t/ ri (t/) ' 1 + ^t/ n (p) 1 

■i— 1 v y t= 1 v ' ■— ' 



2 — 1 



n(0) • 

Solving the second equation in (| 15[) with /i = (x{6) for k thus gives 

iV 



^ + r y2 (0)+r B/m (0)-^gf 

Nn{6) 



(24) £(0) 

(25) ~ Wn (9) + r Y 2 (6)n (9) + n (9)r B/m (9) - r Y {9 f ' 

Substituting fi(9) and k(9) into the third and last equation in (|16j) . we obtain 
the univariate rational equation 

(26 ) SY<e) - 2^s Y (9) + ^s 1 (6) + s B/m {9) - ?M = 0, 

where we have divided by the non-zero rational expression k{9). According to (|24[) . 
this is 

— 7o\ Sy ^> H TOY2 

n{9) r 1 {9) 2 

Wn(9) + r Y 2(9) ri (9) + n(6)r B/m - r Y (9) 2 



(27) SY2 (9)-2^s Y (9) + ^ lSl (9)+ SB/r , 



N 



= 0. 
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Reexpress (|26|) in terms of the / and g polynomials as 



(28) 9Y2{9) 2 Me)9Y{9) • ^W 2 fliW , 9b/ 



d{6f 



h{6) d{ey h{e) 2 d{ey d{ey 

Wh{0)d{6) + f Y 2{9)h{0) + h(0)f B / m - Iy{6) 2 



Nd{9) 2 

The claim now follows by forming a common denominator. 



□ 



The denominator given in (|21l) in Lemma[T]has degree 2M + 2(M — 1) = AM — 2. 
The numerator in (|2"Tj) has degree 3(M — 1) + M = AM — 3; the highest degree term 
involves the within-group sum of squares W. The next two lemmas imply that, 
after cancelling common factors, the numerator of the univariate rational function 
from Lemma [T] has the degree claimed in the ML degree formula from Theorem [TJ 

Lemma 2. If m t = 1, then (1 + n t 9) divides the numerator of the rational equa- 
tion (|2ip . Hence, the polynomial d\ (9) of degree M — Mi divides this numerator. 

Lemma 3. If d\{9) is cleared from both the numerator and the denominator of 
the rational function given in (|21[) . then the new numerator and denominator are 
relatively prime for generic sufficient statistics Y\, . . . , Ym, W, and Bj with rrij > 2. 

Proof of Lemma\^ Let rrit — 1. To show that (1 + n t 9) divides the numerator, it 
is sufficient to show (1 + n t 9) divides the sum of 

(29) N [hidfgy^d) - 2f Y {9)f l {9)g Y (9) + f Y {9) 2 9l {9) + fi(9) 2 g B/m (9)} 
and 

- fi(e) 2 [f Y 2 (9)h(9) - f Y (Bf + h(9)f B/m (9)}. 



(30) 



The product fi(9) 2 g Y 2(9) in the first term of 

M 



} J miUi ]^[(1 + rijO) 



mini 

i=l j^i 
M M M 



M 

y^m fc n fc ]~[(l 

k=l tyk 



me) 



may be rewritten as 

M 

Y,m r n 2 r Y r 2 Y[{l + n s 9) 2 

r—1 s^r 



EEE< 



( m.mkmrn.nknlY 2 TJ(1 + n 3 6) TJ(1 + ntf) JJ(1 + n s 9) 2 

i—1 k—1 r—1 j^i l^k s^r 



Combining this expression with the analogous expansions of the other three terms 
shows that the polynomial in (|29|) is equal to N times 



M M M r 



(3D EEE 



(m r Y 2 — 2m r YiY r + m r YiYk + B r )mimkninkn 2 



i=l k=l r=l 

jy^i l^k s^r 

The polynomial in (|30p can be expanded similarly. We find 
(32) fY*(0)fi(9) - } Y {9) 2 + fi(9)f B/m (9) 

M M 

= J2 E( mfc ^ 2 ~ m kYiYk + Bk)m t n t n k J|(l + n 3 9) TJ(1 + n t 9) 



n s 9f 



i=l k=l 



Ijtk 
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Expanding fi(8) 2 as well, we obtain that the polynomial in ([30]) is equal to 



M M M M 

(33) -EEEE 

i— 1 k—1 r—1 u—1 



(m fe y"j - m k YiY k + B k )mim r m u nin k n r n u 

+ nj 6) Y[(l + ni 0) JJ(1 + n s 9) Yl (1 + n v 9) 

j^i L^k s^r v^u 

Now notice that (1 + n t 9) divides every summand in (pTTj) and (1551) unless i = 
k = r = t in the first summation, or i = k = r = u = t in the second summation. 
So it suffices to only consider these 'diagonal' terms. However, under the equality 
of indices, the quadratic expressions in the averages Yj cancel. Hence, the terms 
missing a factor of (1 + ritO) in ([29)) and (|30|) sum to 

(34) B t n t m 2 t ni {N - m t )\{{l + n 3 6f . 

Throughout the paper, we assume that we have at least two groups with at least 
one group size Tii > 2. Moreover, for generic data, Bt = if and only if mt — 1. 
Hence, for generic data, the expression in (f34|) is zero if and only if mt — 1. We 
conclude that d\{&) divides the numerator of the rational function in ([2~Tj). □ 

Note that the last part of the above proof shows not only that d\ (9) divides the 
numerator of (|2T|) . but that (1 +n t 9) does not divide the numerator when B t ^ 0, 
which holds generically if mt > 2. 

Proof of Lemma\3i Clearing d\{9) from the denominator in (|21|) yields the polyno- 
mial Ndt{9)d(9)fi (9) 2 . From the preceding comment, we know that c?2 (0) and the 
numerator are relatively prime for generic data Yi, . . . , Ym, W > 0, and Bj > 
with rrij > 2. To establish our claim, we will first show that fi{9) does not share 
a common factor with the numerator by showing fi(9) and /V(#) 2 .9i(#) to be rela- 
tively prime; all terms other than fy{0) 2 gi(9) in the numerator of (|21|) are multiples 
of fi(9). Then, we will show that after clearing d\(6) in ([2"T]) . d\(9) and the new 
numerator are relatively prime. 

Let Ox,..., 9m-i be the (possibly complex) roots of the degree M — 1 polynomial 
fi(9). For each 1 < k < M — 1, consider the linear form fy(0 k ) in the polynomial 
ring C[Yx, . . . ,Y M ]. Let V(f Y (9 k )) C C M be the zero locus of f Y (0 k ). Each set 
V(/y(0k)) is a hyperplane of dimension M — 1. Thus, the union U^f = ^ 1 l / (/Y(0/c)) 
is an M — 1 dimensional algebraic subset of <C M . A generic vector of group means 
(Yi, . . . , Ym) lies outside this lower-dimensional set, which means that fi{9) and 
fy{d) are relatively prime for generic data. 

To show that f± (9) and g\ (9) are relatively prime, assume #o = a + ib is a root 
of fi{9) and gi(9). Since <?i (0) is a sum of squares that is positive on K, we must 
have 9o ^ K and hence 6^0. Without loss of generality, let ni be the least of the 
group sizes Uj. Rewriting fi(0o) = 0, we get 



m i Ilj^iC 1 + "A) ~^ mi(l + n^o) 
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The imaginary part of the right side of this equation must equal since n\ is an 
integer. Substituting a + ib for #o, the imaginary part of (j35|) is 



5 E 



m < \ i \ 

mirii\ (Tii — m) 



2 ' 



. , x mi J (1 + riia) 2 + (mb) 

Since each term in the sum is positive, we obtain that 6 = 0. Consequently 9q 6 K, 
which is a contradiction. Therefore, f\{9) and g\{9) are relatively prime. 

It remains to show that the numerator and denominator obtained by clearing the 
factor d\{9) in (|2Tj) are relatively prime for generic data. We claim that if m t = 1 
then (1 + n t 9) divides 

f ™ W) 2 9y*{0) - 2f Y {6)h{e) 9Y (e) + .fY(o) 2 gi(e)h(e) 2 g B/m (e) 

(36) m ' 

while di(9) and 
(37) 

w , , m , , a s . /y 2 (0)/i(0) - Iy(9) 2 + fi(0)f B / m (9) 

Wfc(9)d 2 (9) + ' =: Wfi{9)d 2 {9) + F(9) 

are relatively prime for generic data. 

The ratio in ([551) equals (|5Tj) divided by d± (9) . We may rewrite ([3T|) as 



M M M 

(m r Y 2 — m r YiY r — vn r Y^Y r + m r YiYk + B^miirikninkn' 2 . 



(38) EEE 



i—l k—1 r—1 L 

x + n o e ) IB 1 + n( i + n ^ 

j^i l=£k s^r 

It is clear that the square (1 + nt9) 2 divides all terms in the sum (l38l) except those 
for r = i — t or r — k = t. However, the quadratic form in the averages Yi vanishes 
if r = i or r = k. Since the terms in question have r — t, and B r = B t = because 
m t — 1, we conclude that (1 +rit9) 2 divides the entire sum which proves that 
d\(9) divides the ratio in (|36l) . 

We are left to show that d\{6) and W fx{9)d%{6) + F(9) are relatively prime 
for generic data. Let 9\, . . . , 9m-M 2 be the roots of di(9); each root is equal to 
— 1/rii for some index i. Since the n, are distinct, no root of d\{9) is a root of 
d%{9). Moreover, it is easy to see that no root of d\{9) is a root of fi(9). Now 
let I be the ideal generated by the M - M 2 polynomials W fi(9k)d2{9k) + F(9k) 
in the polynomial ring C[W, Y%, . . . Ym, B[, . . . B' M ], where the B[ stand for the 
between-group sums of squares Bi with multiplicity rrn > 2. Pick sufficient statistics 
W = Yi = ... = Ym 7^ and B\ = . . . = B' M = 0. Since no root of d\(9) is a root of 
d 2 {9) or /i(0), (EH) implies that for these special data M / /i(6» fc )d 2 (6'fe) + ^(^fe) ^ 
for each k. The zero locus V(I) is thus a proper algebraic subset of C M + M 2+i_ S uc h 
a set is of lower dimension and, thus, d\(9) and W fi(9)d 2 (9) + F(9) are relatively 
prime for generic data. □ 
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4. Proof of formula for REML degree 



For the proof of the REML degree formula in Theorem!!] we proceed in the same 
way as for the ML degree. We begin by deriving the univariate rational function 
whose number of roots is the REML degree. 

Lemma 4. Consider the rational junction whose numerator is 

(39) ( 9l (9) - fi(ef)[W h{9)d{9) + fy*(0)M9) ~ M e ? + /iW/fl/m]+ 

(N - 1) [/i(0)V»(*) - 2f Y (9)f\(9)g Y (9) + f Y (0) 2 gi(9) + h(9) 2 g B/m (9)] 



and denominator is 

(40) d(9)h(9) [Wh(9)d(9) + f Y ,{9)h(0) - fy(6? 



l(0)fB/r, 



The REML degree is the degree of the numerator of this rational function after 
clearing common factors from the given numerator and denominator. 

Proof. The equation di/dn = has the unique solution 

m 



compare (| 19|) . Substituting k(9) into the partial derivative di/d9 yields the uni- 
variate function 



M 
i=l 



mini 



EM mini 
»=1 (l+n z 8)' 2 



X^ M 1 
2-fi=l 1 



,1/ 



^ (1 + ni 



0; 



recall (|20|) . We can now simplify and rewrite (1411) . forming a common denominator, 
to obtain the desired rational function. □ 



The degree of the numerator in Lemma 0] is AM — 3, but it shares common factors 
with the denominator. In fact, in the proof of Lemma we have shown that d\{9) 
divides f Y »(0)fi(9) - fY{0) 2 + fi(9)f B/m . Thus, di(6») 2 divides the denominator 
from (|40[). To prove Theorem [T] it remains to prove the following two facts. 

Lemma 5. The polynomial d\(9) 2 divides the numerator (|39[) . 

Lemma 6. After clearing d\{9) 2 from (|39[) and (|40|) . £/ie new numerator and new 
denominator are relatively prime for generic data. 

Proof of Lemma [5[ From the proof of Lemma [21 we know that d\ (9) divides the 
polynomial f Y i{9)fi{9) — f Y {9) 2 + /i(0)/s/ m . Moreover, as shown in the proof of 
Lemma [31 the square d\{9) 2 divides 

fi(9) 2 g Y 2(9)-2f 1 (9)f Y (9)g Y (9) + f Y (9) 2 g 1 (9)+f 1 (9) 2 g B/m (9). 
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To complete the proof of the present lemma, it suffices to show that d\ (9) divides 
gi{6) — fi(9) . However, with some distributing and grouping, we see 

9l (e)-h{ef = 

M M M 

= ^m l n 2 \\{l + n J 9) 2 - ^ ^ mim k n t n k ]J(f + nj 6) J|(f + njO) 

i—1 j^i i—1 k—l j^i l^k 

M M M 

= ^(m, - mj) + n 3 9) 2n * Uk + M) II^ 1 + 

i—1 j^i i—1 k>i j^i l^k 

which is divisible by (f + n t 9) if and only if = 1. □ 

Proof of Lemma\Q We first show that if m t > 2, then, for generic data, (1 + n t 6) 
and the numerator from (|39[) are relatively prime. Consider 



(42) ( 9l (8) - h(6) 2 ) [f Y2 {9)h{9) - f Y (0) 2 + h{9)f B/m {9)} + 

(N - l)[/i(0)V*(0) ~ 2f Y (9)f 1 (9)g Y (9) + f Y (9) 2 9l (9) + g B /m (9)h{9) 2 ]. 

Using the results from the proof of Lemma [5] and writing out the involved summa- 
tions, (l4"2l is seen to be equal to 

(M M M 
} J ^ *^2(m k Y 2 - m k YiY k + B k )m l m r n l n k n 2 r 
i=l k= l r=l 

+ n 3 9) + ni 9) + n s 9) 2 

j^i l^k Sy^r 

(M M M M 
^ ^2 X] X]^ mfc ^ 2 ~~ m kYiYk + B k )mim r m u n t n k n r n u 
i—1 k—l r—1 u—1 

+ n 3 9) + ni 9) + n s 9) ]J (1 + n v 9) 

j^i l^k s^r v^u 

' M M M 

+ (N — 1) y^(m r y,. 2 — 2m r YiY r + m r YiY k + B r )mim k riin k n 2 

2=1 fc=l r=l 



Y[(l + n 3 9) + ni$) + n s 9f 

j^i l^k s^r 

The factor (l + n t 9) divides every summand in the above summations unless t = i = 
k = r = u, so it suffices to only consider these terms. Letting t = i = k = r = u, the 
terms missing a factor of (1 + n t 9) sum to a term we already encountered, namely, 
that in (|3"4")l . The discussion following display (|3"4"|) shows that if the data is generic 
and m t > 2, then (1 + n^f?) does not divide the numerator given in (|30[) . 

Continuing to work through the factors of the denominator from (|40|) , assume 
that 9q is a root of fi(9). Then everything vanishes in the numerator except for two 
terms -gi(9 )f Y {9 ) 2 and {N~l)f Y {9 ) 2 9l {9 ), which add to (N-2)f Y (9 Q ) 2 9l (9 Q ). 
From the proof of Lemma El we know jy (#o) 2 gi(#o) ^ for generic data, so since 
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we are working under the assumption of at least two groups and at least one group 
size rii > 2, the numerator and fi(9) are relatively prime for generic data. 
Finally, we need to show 

H(9) := fi(9fg Y *(9) - 2h(9)f Y (9)g Y (9) + f Y (e) 2 gi(8) + fi(9) 2 g B/m (&) 

and 

G(9) :=Wfi(9)d 2 (9)+F(9), 
are relatively prime for generic data W, Y\, . . .Ym, and Bi with m, > 2; the 
polynomial F(9) was defined in ([37]) . We will again denote the between-group sums 
of squares with multiplicities m, > 2 as B[ , . . . , B' M} . By a standard algebraic 
results, the polynomials G(9) and share a common root 9 if and only if 

a certain polynomial in their coefficients vanishes; this polynomial is called the 
resultant and we denote it by Res(G, H). Since both H(9) and G(9) have coefficients 
that are polynomials in the sufficient statistics W , Y\ , . . . Ym , and B[ , . . . , B' M2 , we 
may regard Res(G, H) as a polynomial in the ring C[W, Y\,... Ym, B[, . . . , B' Mo ]. 
By Lemma El for any given generic choice of Y\ , . . . , Ym > -B( , . . . , B' M , a root # of 
H is not a root of fi(9) or d 2 (9). Hence, 9o is a root of G if and only if 

(44) W = F{do) . 

Picking W not to satisfy (|44)l shows that Res(G, H) is not the zero polynomial in 
C[W, Y\, . . . Ym, B[, . . . , B'm ]• Hence, the zero locus of Res(G, H) is a set of lower 
dimension, and we conclude that H and G are relatively prime for generic data. □ 

5. General mean structure in the one-way layout 

The one-way layout as specified in ([1} postulates a common mean fi for all obser- 
vations Yij. Often the interest is instead in a more general mean space. Formally, 
consider the model 

(45) Y^ = My +«i + i=l,...,q, j = 1,.. . .n*, 

where the random effects oti ~ Af(0, t) and the error terms e y ~ A/"(0, w) are again 
all mutually independent. However, the array of means (/xy) may now belong to a 
linear subspace of M. N that we assumed to be spanned by the independent columns 
of a full rank design matrix X € ~R Nxp ; as before, TV = m + • ■ ■ + n q is the sample 
size. In other words, 

(46) vec( fMj )=Xp 

for some unknown (fixed) mean parameter vector /3 GW. 

ML and REML estimation with more general mean structure can be approached 
algebraically in the exactly the same way as before. It is convenient to reparametrize 
the covariance matrix in terms of k = l/u) and 9 = t/oj. For known covariancc 
parameters, the ML estimate 0(9) of /3 is obtained by generalized least squares and 
depends on 9 but not on k. For fixed 9, it is then also straightforward to solve 
the ML or REML equations for k. This way we may reduce algebraic solution of 
the likelihood equations to solving a single rational equation in 9. In this section 
we demonstrate that the involved algebraic computations are feasible in a larger 
example. Before going into the details of the example, we would like to offer the 
following conjecture based on numerical experiments with smaller models and ran- 
domly chosen design matrices. It states that the ML and REML degrees for the 
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model specified by (|45|) and (j46|) cannot exceed the largest possible respective de- 
grees in the model with common mean \i. Recall that the largest degrees arise in 
the entirely unbalanced case with group sizes m, . . . , n q that are pairwise distinct. 

Conjecture 1. For any design matrix X £ M. Nxp that has the vector (1, . . . , 1) T 
in its column span span(X), the ML degree for the one-way layout with mean space 
span(X) and q random group effects is bounded above by 3g — 3. Similarly, the 
REML degree is bounded above by 2q — 3. 

According to this conjecture, the degrees would grow only linearly with the num- 
ber of groups, which would suggest that a moderately large number of unbalanced 
groups can be handled in algebraic computations. 

Example 3. With the goal of providing linguistic support for an African origin of 
modern humans, Atkinson [AtkTT] fits regression models to data on the phonemic 
diversity of languages. The data, which can be obtained from the journal's online 
supplementary material, concern N = 504 languages that are classified into q = 109 
language families. Besides quantitative summary measures of phonemic diversity, 
the available information includes the size of the population speaking each language 
and the distance between a chosen center for each language and an inferred origin 
in Africa, the latter being the main covariate of interest. 

One model of interest in this application is a one-way layout with groups corre- 
sponding to the language families. The response Yjj is the phonemic diversity of 
the jth language in the ith family, which, as in (|45l) and (14T>1) . is modelled as 

(47) Y ij =j3 +/3 1 log(Pi j )+f32D ij +ai+e ij , i = l,...,q, j = l,...,m. 

Here, i\j stands for the population size and Dij is the distance from the origin in 
Africa. As can be expected, the data is unbalanced. The group sizes rii, . . . , niog 
fall into the range from 1 to 62. There are M = 17 distinct group sizes of which 
M2 = 9 have multiplicity two or larger. Hence, by Theorem [TJ the one-way layout 
with all means equal has ML degree 57 and REML degree 49. However, as we show 
next, the mean structure can affect the ML and REML degree. 

Computations we did using the software Maple show that the ML degree of the 
model given by (|47|) is 83, whereas the REML degree is 71. Exact computations 
in analogy to the ones given in Example [1] produce large integer coefficients, too 
large to display on paper but easily handled by a computer. Solving the polynomial 
equations for ML and REML numerically, each equation is seen to have a unique 
positive root, namely, 

(48) §ml ~ 0.3706 and (9 RE ml « 0.3853. 

Each root gives a local and, thus, global maximum of the concerned likelihood func- 
tion. We remark that the ML equation has twelve negative real roots. The REML 
equation has no other real roots. Running the numerical optimizers implemented 
in the R package lme4 yields estimates that agree with (|48|) in all the given digits. 
As in Example [1] the fact that our two univariate polynomials each have a unique 
positive root manifests itself in a single sign change in the coefficient sequence. Fi- 
nally, we remark that when omitting either the covariate log(P) or the covariate D, 
the ML degree drops to 72 and the REML degree drops to 61. 



1(5 



GROSS, DRTON, PETROVIC 



6. Balanced two-way layouts 

Suppose we have observations Yijk that are cross-classified according to two 
factors and model the observations in an additive two-way layout as 

(49) Y ijk = n + ai+0j + e ijk , 

i = l,...,r, j = l,...,q, k = l,...,n. 

The terms on ~ A/"(0, t{) and j3j ~ A/"(0, T2) are normally distributed random effects. 
The error terms are distributed as e^k ~ JV(0,uj), and all the random variables ai, 
(3j and e^k are mutually independent. Finally, there is one (fixed) mean parameter 
/1 £ I. A related model is obtained by including random interaction terms 7^ ~ 
7V(0,Ti2) in the defining equations 

(50) Yijk = ^ + cti + j3j + jij + e ijk , 

i = l,...,r, j = l,...,q, k = l,...,n. 

The interaction terms 7^ are again mutually independent and independent of all 
other random variables appearing on the right hand side of (|50[) . 

The models in (|4T)|) and ([50]) are balanced; the groups of observations Y i} ;1 , . . . , Yij n 
specified by the different index pairs are all of size n. It is known that 

REML leads to closed form estimates for each of the two balanced models; compare 
[Hoc851 [SCM921 ISO04] . In other words, the REML degree of either model is one. 
ML estimation, however, presents a non-trivial algebraic problem. The ML degree 
can be derived using Grobner basis calculations, and we sec that balanced two-way 
layouts have closed form ML estimates in the sense of Cardano's formula. 

Theorem 2. The ML degree of balanced additive two-way layout with random 
effects is four. The same holds for the model with random interaction. 

Proof. Define the sum of squares 

r 

SSA = J2 ( in( Y i- -Y...) 2 , 

i=i 

r 

SSB=J2 rn (Yj.-Y..) 2 , 

3 = 1 
r q 

SSAB =Y,Y, n (% " ^- " + ^--) 2 ' 

i=i j=i 

r q n 
i=l j = l k=l 

where we use the convention that the overbar indicates that an average was formed 
and the '.' subscripts specify which indices were averaged over. 

(No interaction) The ML equations for the additive model given by (|4T))) are 
derived, for instance, in Chapter 4.7.d of SCM92 and in Chapter 3 of SO04 . One 
equation leads to the ML estimator 
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The rational equations for the variance components may be written as 

rqn-r-q+l 1 SSAB+SSE 

( 51 ) T T = 2 ' 

lu lu + qnTi + rnTi ui 

(52) + 1 ^ 



(53) 



u> + qnr\ lu + qnr\ + rm"2 (w + gnri) 2 ' 

g - 1 1 555 
— + = 7 - 

cj + rnT2 lu + qnr\ + rnr 2 [lu + rnr2 / 



2 ' 



Clearing the common denominators lu 2 , (lu + qnri) 2 , (w + r/iT 2) 5 and w + gwi + 
rn72 gives a polynomial equation system. However, multiplying each equation 
with the relevant product of these denominators introduces new solutions that 
are not solutions of the original rational equations. Using saturation as explained 
in Chapter 2 of [DSS09] . we can remove these extraneous solutions and obtain a 
polynomial equation system of degree 4. (We remark that software such as Maple 
is able to produce a lexicographic Grobner basis over the field of fractions in r, q, 
n, and the four sums of squares.) 

(With interaction) Chapter 4.7.d of |SCM92] also gives the ML equations for 
the model with interaction defined by ([5171); see & l so Chapter 4 of |SO04| . Two 
equations determine the ML estimators 

- v SSE 



rq(n — 1) 

The rational equations for the remaining variance components can be written as 
(r-l)(g-l) 1 _ SSAB 



(54) 
(55) 
(56) 



lu + rvryi lu + qnTi + ttvt% + nr 12 (lu + nri 2 ) 2 ' 
r - 1 1 55,4 



qnri + TIT12 lu + qnri + rnr 2 + nri 2 {lu + qnTi + nTi 2 ) 2 ' 
q-l 1 SSB 



lu + rnr 2 + nTi 2 lu + qnri + rnT 2 + nTi 2 (lu + rnr 2 + nri 2 ) 2 



Clearing the denominators carefully via saturation yields a polynomial equation 
system of degree 4. (Again, a lexicographic Grobner basis can be obtained with r, 
q, n and the sums of squares as parameters to the equations.) □ 



We briefly illustrate algebraic computation of the ML estimators in an example 
that involves the additive two-way layout. 

Example 4. The R package lme4 contains data from experiments for an assessment 
of the variability between samples of penicillin. The data are described in detail 
in |DG72| . The response is a diameter measurement of the zone in which growth 
of an organism is inhibited by the penicillin. The experiments are cross-classified 
according to the assay plate and the penicillin sample used. The former is a factor 
with r = 24 levels, the latter has q = 6 levels. There are no replications to be 
considered in this case, that is, n — 1. We will consider the additive model for 
which the relevant sums of squares are 

SSA = 105f , SSB = 449|, SSAB + SSE = 34|. 
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Using the saturation computation alluded to in the proof of Theorem [2j we 
obtain the polynomial equation system 

204808595904 w 4 - 1801205257140 w 3 + 2545119731943 uj 2 - 1070402996440 w 

+ 139045932165= 0, 
2481278604010272 ti + 507582172417738176 uj 3 - 4309720916424828084 a; 2 

+4998133978544934251 u - 1133204709683307975 = 0, 
2481278604010272 r 2 + 534435082556924736 J 3 - 4538697213124439100 uj 2 

+5270402449572117709^- 1201351121037374475 = 0. 

This polynomial system has the same solution set as the original rational ML equa- 
tions. The polynomials on the left hand sides of the equations form a lexicographic 
Grobner basis and are readily solved. First, solve the quartic equation in u). Next, 
plug each of the four solutions for w into the other two equations and solve the 
resulting linear equations for t\ and T2, respectively. In the present example, all 
four solutions are real but only one is feasible with ui, Ti, t% > 0. This solution is 

lo = 0.302425, n = 0.714992, f 2 = 3.135188. 

It defines the unique global maximum of the likelihood function. 

7. Conclusion 

This paper takes a first step towards understanding the algebraic complexity of 
ML and REML estimation in linear mixed models. Our main results in Theorem Q] 
concern the unbalanced one-way layout with common mean for all observations. It 
would be interesting to generalize the results to one-way classifications with more 
complicated mean spaces; recall Conjecture [TJ Similarly, it would be interesting 
to study unbalanced two- and higher-way layouts, although these models would 
require more sophisticated mathematical treatment because it is no longer possible 
to analyze a single univariate rational equation; compare Section [6] 

A remarkable feature common to Examples [1] and is that Descartes' rule of 
sign applied to a univariate polynomial in the variance ratio 9 reveals that there 
is a unique feasible solution to the ML/REML equations. The same was true for 
many other examples of unbalanced one-way classifications that we computed. This 
said, we also saw cases with more than one sign change and the number of positive 
solutions for not matching up with the sign changes. 

To our knowledge, the literature does not supply many examples of linear mixed 
models with multimodal likelihood functions. We conclude by giving two simulated 
examples that demonstrate the mathematical possibility of more than one mode. 
Such examples were rare in our simulations, which is in agreement with findings of 
SM84] who also treat the unbalanced one-way layout. While uniqueness of local 
optima is not explicitly discussed in [SM84] , the authors remark in their conclusion 
that "varying the iteration starting point slightly affects the rate of convergence, 
but not the [mean square errors] or biases of the [ML and REML] estimators." The 
examples we give involve three positive roots to the ML or REML equations for the 
variance ratio 9. We do not know of examples with more positive roots. 
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Example 5. Consider the one-way layout with a single grand mean /j, from |T]). 
Take q = 5 groups of sizes 

ni = 2, ri2 =5, = 10, 7J4 = 20, n$ = 50. 

Let the sufficient statistics be the five group averages 

= ~Wk « ~ 5 - 1546 ' ^ = H| « 0.1759, 

? 3 = -Hfi « -0.1441, n = s « °- 1541 ' 

^ = -ffi « -0.6514, 
and the within-group sum of squares 

W = « 276.69. 

The univariate ML equation in 9 has three nonnegative solutions, namely, 
"ML.l « 0.00838738, « 0.118458, « 0.338944; 

having specified six digits we should add that the solutions were computed treating 
the above rational fractions as the input. The solution #ml.i yields the global max- 
imum of the likelihood function, whereas 9ml, 2 and #ml,3 determine a saddle point 
and local maximum, respectively. In contrast, the restricted likelihood function has 
a unique local and global maximum for 

0REML « 0.771763. 

The data was simulated from the model with mean /j,q = 0, and variance compo- 
nents To = 3 and ojq = 2, which gives 9q = 3/2. 

Example 6. Continuing with the setup from Example change the sufficient 
statistics to 

V — 230081 ^ ^ 7O0R V — 721282 _ n 1 901 

y l - 40206" ~ 5.722b, Y 2 - 5630371 ~ U.1281, 

^3 = Hi « 0.3064, F 4 = ifif « 0.4045, 



and 



= ^fip ~ 429.22. 
1759 



Now, all real solutions to the ML equations are negative. Thus, the global maximum 
of the likelihood is achieved at the boundary point 9ml = 0. In contrast, the REML 
equations have three feasible solutions for 9, namely, 

$reml,i ~ 0.00492193, ^reml,2 ~ 0.159465, 0reml, 3 « 0.2414611. 

The solution #r,eml,i gives the global maximum of the restricted likelihood function. 
The solutions #reml,2 and #reml,3 determine a saddle point and a local maximum, 
respectively. The data was simulated as in Example [5] 

Readers experimenting with the two examples just given will find the likelihood 
functions to be rather flat between the three stationary points, which give log- 
likelihood values that differ by less than 0.1. 

In both Example [5] and Example [6l the first group is of the smallest size but has 
group mean that is largest in absolute value. The other means are comparatively 
close to each other. We experimented with permuting the means, while holding 
the group sizes fixed. In Example El eight out of 120 permutations give bimodal 
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restricted likelihood functions. Two permutations yield three positive roots to the 
REML equations. The other six cases have two positive roots, and one of the two 
local maxima occurs for 8 — 0. The eight permutations generate the group of per- 
mutations that keep the first mean fixed. In this example, there is clearly negative 
correlation between the group sizes Hi and the group means Yi. (In practice, such 
dependence could arise from selection effects.) The eight permutations of inter- 
est turn out to give the eight most negative correlations between group sizes and 
means. In similar experiments for Example [5j which features positive correlation 
between group sizes and means, bimodal likelihood functions are obtained for 18 
permutations. Again, these permutations keep the first mean fixed. Only three 
permutations give three positive roots to the ML equations. The 18 permutations 
include the top six permutations in terms of large positive correlation but also the 
permutation whose associated correlation ranks 43rd. 

While dependence between group means and sizes plays a role in Examples [S] 
and[6j the precise interplay between them appears to be subtle. For instance, when 
varying the mean Y\ in Example [5] and keeping all other sufficient statistics fixed, 
we find that there are three positive roots to the ML equations when —5.47 < 
Y\ < —5.08 but a unique root otherwise; we experimented with a grid of values 
in [—10, 10]. In particular, the likelihood function is unimodal for larger negative 
values of Y\. It would be interesting, but presumably difficult, to get a better 
understanding of the semi-algebraic set of sufficient statistics that give (restricted) 
likelihood functions with more than one local maximum. 
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